Mining Optimized Association Rules with Categorical and Numeric Attributes

Authors

  • Rajeev Rastogi
  • Kyuseok Shim
Abstract

Mining association rules on large data sets has received considerable attention in recent years. Association rules are useful for determining correlations between attributes of a relation and have applications in marketing, financial, and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes and the problem is to determine instantiations such that either the support or confidence of the rule is maximized. In this paper, we generalize the optimized association rules problem in three ways: 1) association rules are allowed to contain disjunctions over uninstantiated attributes, 2) association rules are permitted to contain an arbitrary number of uninstantiated attributes, and 3) uninstantiated attributes can be either categorical or numeric. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving multiple attributes. We present effective techniques for pruning the search space when computing optimized association rules for both categorical and numeric attributes. Finally, we report the results of our experiments that indicate that our pruning algorithms are efficient for a large number of uninstantiated attributes, disjunctions, and values in the domain of the attributes.
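
To make the optimization problem in the abstract concrete, the following is a minimal brute-force sketch, not the pruning algorithm of the paper. It assumes a hypothetical toy relation of (day, destination) records, one uninstantiated categorical attribute (the day), and the optimized-confidence variant in which confidence is maximized subject to a minimum-support threshold. Support is taken here as the fraction of rows satisfying the antecedent, which is also an assumption; other definitions count antecedent and consequent together. All names (`CALLS`, `support_and_confidence`, `best_confidence_rule`) are illustrative.

```python
from itertools import combinations

# Hypothetical toy relation: one (day, destination) pair per record.
CALLS = [
    ("mon", "NY"), ("mon", "NY"), ("mon", "LA"),
    ("tue", "NY"), ("tue", "LA"), ("tue", "LA"),
    ("wed", "NY"), ("wed", "NY"), ("wed", "NY"),
    ("thu", "LA"),
]


def support_and_confidence(rows, days, target):
    """Evaluate the rule (day in `days`) -> (destination == target).

    Assumption: support is the fraction of rows matching the antecedent,
    and confidence is the fraction of those rows that also match the
    consequent.
    """
    matched = [dest for day, dest in rows if day in days]
    if not matched:
        return 0.0, 0.0
    support = len(matched) / len(rows)
    confidence = sum(dest == target for dest in matched) / len(matched)
    return support, confidence


def best_confidence_rule(rows, target, max_disjuncts, min_support):
    """Exhaustively try every disjunction of up to `max_disjuncts` day values
    and keep the one with the highest confidence among those that reach
    `min_support`. No pruning is attempted, so the cost grows
    combinatorially with the domain size.
    """
    values = sorted({day for day, _ in rows})
    best = None
    for k in range(1, max_disjuncts + 1):
        for days in combinations(values, k):
            sup, conf = support_and_confidence(rows, set(days), target)
            if sup >= min_support and (best is None or conf > best[2]):
                best = (days, sup, conf)
    return best  # None when no instantiation reaches min_support


if __name__ == "__main__":
    print(best_confidence_rule(CALLS, "NY", max_disjuncts=2, min_support=0.5))
```

The exhaustive enumeration is what makes pruning the search space necessary once the number of uninstantiated attributes, disjunctions, or domain values grows.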

Similar resources

Mining Optimized Support Rules for Numeric Attributes

Mining association rules on large data sets has received considerable attention in recent years. Association rules are useful for determining correlations between attributes of a relation and have applications in marketing, financial, and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. O...

Interestingness-Based Interval Merger for Numeric Association Rules

We present an algorithm for mining association rules from relational tables containing numeric and categorical attributes. The approach is to merge adjacent intervals of numeric values, in a bottom-up manner, on the basis of maximizing the interestingness of a set of association rules. A modification of the B-tree is adopted for performing this task efficiently. The algorithm takes O(kN) I/O ti...

Multi-objective Numeric Association Rules Mining via Ant Colony Optimization for Continuous Domains without Specifying Minimum Support and Minimum Confidence

Currently, all search algorithms that rely on discretizing numeric attributes for numeric association rule mining lose the original distribution of those attributes. This loss of information means that the association rules generated through this process are not precise and accurate. Based on this fact, algorithms which can natively h...

Knowledge Discovery from Health Data Using Weighted Aggregation Classifiers

Introduction. The automatic construction of classifiers is an important research problem in data mining, since a classifier provides not only good predictions but also a characterization of the given data in a form easily understood by a human. A decision tree [4] is a classifier widely used in real applications; it is easy to understand and can be efficiently constructed by using a method based ...

Constructing Efficient Decision Trees by Using Optimized Numeric Association Rules

1 Introduction. We propose an extension of an entropy-based heuristic of Quinlan [Q93] for constructing a decision tree from a large database with many numeric attributes. Quinlan pointed out that his original method (as well as other existing methods) may be inefficient if any numeric attributes are strongly correlated. Our approach offers one solution to this problem. For each pair of numeric...

Publication date: 1998